Spectral Entropy Feature in Multi-Stream for Robust ASR

Authors

  • Hemant Misra
  • Hervé Bourlard
Abstract

In recent papers, entropy computed from sub-bands of the spectrum was used as a feature for automatic speech recognition. In the present paper, we further study sub-band spectral entropy features, which capture the flatness or peakiness of each sub-band spectrum and, in turn, the position of the formants in the spectrum. The sub-band spectral entropy features are used in hybrid hidden Markov model/artificial neural network systems and are found to be noise robust. The spectral entropy features are investigated along with PLP features in multi-stream combination. Separate multi-layer perceptrons (MLPs) are trained for PLP features, for spectral entropy features, and for the concatenation of both feature sets. The output posteriors of the three MLPs are combined after weighting, such that the weight given to a particular MLP's outputs is inversely proportional to the entropy of that MLP's output posterior distribution. In the Tandem framework, the combined output, after decorrelation, is fed to a standard hidden Markov model/Gaussian mixture model system. Significant improvement in performance is reported when spectral entropy features are used along with PLP features in multi-stream combination.
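
The two ideas in the abstract, sub-band spectral entropy and inverse-entropy weighting of MLP posteriors, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the equal-width sub-band split, the natural-log entropy, the treatment of an all-zero band as flat, the weighted-sum combination rule, and the function names `subband_spectral_entropy` and `inverse_entropy_combine`.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability mass function."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def subband_spectral_entropy(power_spectrum, n_bands):
    """Return one entropy value per sub-band of a power spectrum.

    Assumption: sub-bands are contiguous and of (nearly) equal width.
    Each sub-band is normalised to sum to 1 so it can be read as a PMF;
    a flat band gives high entropy, a peaky band gives low entropy.
    """
    n = len(power_spectrum)
    if not 1 <= n_bands <= n:
        raise ValueError("n_bands must be between 1 and len(power_spectrum)")
    features = []
    for b in range(n_bands):
        lo = b * n // n_bands
        hi = (b + 1) * n // n_bands
        band = power_spectrum[lo:hi]
        total = sum(band)
        if total <= 0.0:
            # Assumption: an all-zero band is treated as flat (max entropy).
            features.append(math.log(len(band)))
            continue
        features.append(entropy([x / total for x in band]))
    return features


def inverse_entropy_combine(stream_posteriors, eps=1e-12):
    """Combine per-stream posteriors with weights proportional to 1/entropy.

    Assumption: the combination is a weighted sum with weights normalised
    to sum to 1; eps guards against division by zero for one-hot outputs.
    """
    inv = [1.0 / (entropy(p) + eps) for p in stream_posteriors]
    z = sum(inv)
    weights = [v / z for v in inv]
    n_classes = len(stream_posteriors[0])
    combined = [
        sum(w * p[k] for w, p in zip(weights, stream_posteriors))
        for k in range(n_classes)
    ]
    return combined, weights


# Usage: a peaky first band and a flat second band.
feats = subband_spectral_entropy([0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0], 2)

# Usage: a confident stream and an uninformative stream.
combined, weights = inverse_entropy_combine(
    [[0.9, 0.05, 0.05], [1 / 3, 1 / 3, 1 / 3]]
)
```

In this sketch the confident stream (low posterior entropy) receives the larger weight, which matches the abstract's description; the decorrelation step of the Tandem framework is not shown.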


Similar resources

Spectral entropy feature in full-combination multi-stream for robust ASR

In a recent paper, we reported promising automatic speech recognition results obtained by appending spectral entropy features to PLP features. In the present paper, spectral entropy features are used along with PLP features in multi-stream framework. In our multistream hidden Markov model/artificial neural network system, we train a separate multi-layered perceptron (MLP) for PLP features, spec...


New entropy based combination rules in HMM/ANN multi-stream ASR

Classifier performance is often enhanced through combining multiple streams of information. In the context of multistream HMM/ANN systems in ASR, a confidence measure widely used in classifier combination is the entropy of the posteriors distribution output from each ANN, which generally increases as classification becomes less reliable. The rule most commonly used is to select the ANN with the...


Recognition using speech synthesis : a reactive dynamic for robust ASR

Automatic Speech Recognition (ASR) systems are not efficient under noisy speech. In the Multi-Stream (MS) approach, commonly used to reinforce ASR robustness, each stream feeds one recognizer generating estimates which are combined through a fusion process. As some streams are optimal for transmission of some phonemes [1,3], it is then interesting to over weight the best stream during the featu...


Multi-stream ASR: an oracle perspective

Multi-stream based automatic speech recognition (ASR) systems are usually shown to outperform single stream systems, especially in noisy test conditions. And, indeed, there is a trend today in ASR towards using more and more acoustic features combined at the input (early integration, possibly preceded by some linear or nonlinear transformation) or later in the recognition process (e.g., at the l...


Towards Robust and Adaptive Speech Recognition Models

In this paper, we discuss a family of new Automatic Speech Recognition (ASR) approaches, which somewhat deviate from the usual ASR approaches but which have recently been shown to be more robust to nonstationary noise, without requiring specific adaptation or “multi-style” training. More specifically, we will motivate and briefly describe new approaches based on multi-stream and subband ASR. Th...



Publication date: 2005